Aphelenchus avenae genome highlights evolutionary adaptation to desiccation | Communications Biology

2021-11-16 08:03:11 By : Ms. Winnie Wei

Communications Biology volume 4, Article number: 1232 (2021)

The author's correction to this article was published on November 12, 2021

This article has been updated

Some organisms can withstand complete body water loss (losing up to 99% of body water) and remain in an ametabolic state for decades until rehydration, a phenomenon called anhydrobiosis. Few multicellular eukaryotes can survive without water at the adult stage, and the mechanisms by which anhydrobiotic metazoans survive are not yet fully understood. Here, we report the 255-Mb genome of Aphelenchus avenae, which can tolerate 0% relative humidity for many years. Gene duplications arose genome-wide and contributed to the expansion and diversification of 763 kinases, which represents the second largest metazoan kinome to date. Transcriptome profiling of anhydrobiotic A. avenae showed that ATP levels are elevated in the early stage of anhydrobiosis, supporting global recycling of macromolecules and enhanced autophagy. We catalogued 74 species-specific intrinsically disordered proteins that may help A. avenae survive desiccation stress. Our findings refine the molecular basis of survival under extreme water loss and open the way for the discovery of new anti-desiccation strategies.

Although water is essential to life, certain organisms in the three kingdoms of life can enter an ametabolic state, called anhydrobiosis, to withstand extreme water loss at particular stages of the life cycle (for example, metazoan larval and/or adult stages)1. Few metazoans, such as rotifers, tardigrades, and nematodes, can tolerate complete desiccation as adults. The genomes of bdelloid rotifers and tardigrades have been decoded2,3,4, revealing rotifer- and tardigrade-unique proteins that act as protective molecules. Large-scale horizontal gene transfer (HGT) events were found in the genome of the bdelloid rotifer Adineta vaga2,5, whereas extensive HGT was not found in the genomes of the tardigrades Ramazzottius varieornatus and Hypsibius dujardini3,4,6. Progress has been made in understanding the molecular mechanisms by which these organisms enter anhydrobiosis7,8,9,10. However, the important species-specific protective molecules and anti-desiccation mechanisms in nematode species remain to be determined.

Nematodes form a species-rich phylum distributed across diverse habitats, including terrestrial and aquatic ecosystems as well as animal and human hosts. A. avenae, certain other species in the suborder Tylenchina, and Antarctic nematodes (such as Plectus murrayi in the class Chromadorea) undergo anhydrobiosis, whereas species in other suborders tolerate desiccation only at the larval stage or as larvae within the egg11,12,13,14,15,16. The genome and transcriptome of P. murrayi, which inhabits the Antarctic ecosystem, have recently been characterized17. As an anhydrobiotic model, A. avenae can escape cell death at 0% relative humidity for many years. When water becomes available, anhydrobiotic A. avenae resumes normal metabolic activity, which indicates that A. avenae has evolved a molecular shielding system to protect somatic cells during long-term desiccation18. A. avenae can withstand only a slow rate of water loss and requires preconditioning at 97% relative humidity19. The non-reducing sugar trehalose and late embryogenesis abundant (LEA) proteins have been proposed to act synergistically to form a gel-phase bioglass in anhydrobiotic A. avenae18,20. It is not fully understood which genetic changes during the evolution of anhydrobiotic nematodes confer resistance to complete desiccation, or how A. avenae reprograms its cellular state to survive at the different stages of anhydrobiosis.

Here, we report a high-quality genome of A. avenae and its transcriptome changes in response to gradual water loss. Comparative genomic and transcriptomic analyses will help to clarify the general and species-specific molecular mechanisms by which complex animals survive extreme water loss.

Based on flow cytometry analysis, we estimated the genome size of A. avenae to be 255 Mb, which is 2.5 times that of Caenorhabditis elegans. To decode the genes underlying anhydrobiosis, we sequenced the A. avenae genome to approximately 180× coverage using the Roche/454 and Illumina platforms (Supplementary Table 1). K-mer counting further supported a genome size of 255 Mb (Supplementary Figure 1). The draft assembly consists of 18,660 scaffolds totaling 264.8 Mb, with an N50 of 142 kb (n = 385) and a maximum scaffold length of 5.5 Mb (Figure 1a, Supplementary Table 2, Supplementary Note). A total of 218.6 Mb of bases, representing 98% of the assembled genome sequence, were covered by more than 20 reads (Supplementary Figure 2), indicating high single-base accuracy. Repetitive sequences cover 16.7% of the genome, and most of them are uncharacterized repeats (Supplementary Table 3). The average density of single nucleotide polymorphisms (SNPs) is about three variants per kb (Figure 1a, Supplementary Note), confirming that A. avenae individuals are highly identical owing to parthenogenesis. We determined that the assembly is at least 97% complete, based on the mapping of conserved eukaryotic orthologous groups (KOGs) (Supplementary Note). The BUSCO-based assessment of genome completeness identified 236 (92.6%) of the 255 conserved BUSCO genes (Supplementary Note). In addition, all 5120 expressed sequence tags (ESTs) retrieved from NCBI could be mapped to the assembled scaffolds, indicating the completeness of the assembly.
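N50 is the scaffold length at which scaffolds of that length or longer hold at least half of the assembled bases. The following is a minimal sketch of that calculation, assuming that the "n = 385" reported above is the number of scaffolds needed to reach that point (often called L50). The function name and the toy lengths are hypothetical and are not the A. avenae scaffolds.

```python
def n50(lengths):
    """Return (N50, L50) for a list of scaffold lengths.

    N50: length of the scaffold at which the running total, taken from the
         longest scaffold downwards, first reaches half of the assembly.
    L50: number of scaffolds needed to reach that point.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Hypothetical scaffold lengths in kb (illustration only).
toy_lengths = [100, 80, 60, 40, 20, 10]
print(n50(toy_lengths))
```

With real data, the lengths would be read from the assembly FASTA or its index rather than typed in.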

a A map of the 23 largest scaffolds in the A. avenae genome. (A) Circular visualization of the 23 largest scaffolds. The axis numbers represent the scaffold span (Mb). (B) Gene models on the 23 largest scaffolds in the A. avenae genome. Red: genes conserved among A. avenae, C. elegans, B. malayi, M. hapla, A. suum, B. xylophilus, H. contortus, and T. spiralis. Green: other genes conserved between A. avenae and C. elegans. Blue: genes unique to A. avenae. (C) Predicted operons in the A. avenae genome. (D) Heat map of up-regulated genes during desiccation stress in A. avenae. The color represents the FPKM value (fragments per kilobase of transcript per million mapped reads). (E) Histograms of up-regulated (blue) and down-regulated (light red) genes with more than three-fold change during desiccation of A. avenae. (F) Heat map of down-regulated genes during desiccation of A. avenae. The color represents the FPKM value. (G) Duplications in the A. avenae genome (changes per kb). (H) SNP distribution in the A. avenae genome. (I) Genomic sequencing read coverage (average per 100 bases). Yellow: >100×. Red: <40×. Blue: >40× but <100×. b Gene gains and losses among the eight nematode species. The numbers of gene gains (+) and losses (-) are listed on the branches. Single-species and shared genes of the eight nematode species are shown in pie charts.

We predicted 43,192 protein-coding genes, accounting for 15% of the genome, with an average gene density of 169 genes per Mb (Supplementary Table 2). The assembled transcriptome supports 80% (34,387) of the gene models. Whereas 17% of C. elegans protein-coding genes are organized into operons, 23.3% of A. avenae protein-coding genes were predicted to lie in operons (Figure 1a, Supplementary Note, Supplementary Data 2). We assigned functions to 21,910 (50.7%) protein-coding genes. BLASTP identified 15,645 protein sequences (36.2%) as shared with other nematodes and 19,080 sequences (44.2%) as species-specific orphan genes (Supplementary Note).

To characterize the birth and death of gene families that underlie species adaptation, we compared the genomes of eight nematode species: A. avenae, C. elegans, Brugia malayi (which causes lymphatic filariasis), Trichinella spiralis (which causes trichinosis), Meloidogyne hapla (which causes root-knot disease), Ascaris suum (which causes ascariasis), Bursaphelenchus xylophilus (the pine wood nematode, which causes pine wilt disease), and Haemonchus contortus (which causes haemonchosis). Only 998 gene families are conserved among the eight nematode species, which suggests that the birth and death of gene families have proceeded differently across nematodes. The gene birth/death tree showed that, compared with the other six nematode species, A. avenae and B. xylophilus shared the most gene families (Figure 1b). A total of 26,537 (61.4%) proteins are unique to A. avenae, and they play roles in the immune system, developmental processes, and response to stimuli (Supplementary Figure 3). An additional comparison between A. avenae and clade IV species showed that 20,615 (47.7%) proteins are unique to A. avenae (Supplementary Note, Supplementary Table 7).

Next, we examined the genomic rearrangements between A. avenae and selected nematode species to reveal syntenic relationships that have evolved over millions of years. Scaffolds/chromosomes longer than 100 kb were compared pairwise. By analyzing genome-wide alignments and clustering highly conserved relationships, we found that synteny between A. avenae and B. xylophilus (3.4%) was significantly higher than that between A. avenae and C. elegans (0.5%) or between A. avenae and M. hapla (0.5%) (Supplementary Figure 4). A. avenae and B. xylophilus share 60 collinear blocks, whereas A. avenae and C. elegans share 11 collinear blocks, indicating that A. avenae and B. xylophilus diverged more recently than A. avenae and C. elegans, consistent with the closer relationship observed in the gene family birth and death analysis. We observed seven syntenic blocks between the largest scaffold (5.5 Mb) of A. avenae and chromosome III of C. elegans, including two inverted syntenic blocks. The longest syntenic block contains 19 linked genes and spans 1.76 Mb in A. avenae and 1.06 Mb in C. elegans. The synteny between A. avenae and C. elegans indicates an intrachromosomal rearrangement pattern, which is a common theme across the phylum.

The number of A. avenae genes is almost twice that of other sequenced nematode genomes, which may be due to a high degree of gene duplication. Compared with the C. elegans genome, in which about one-third of genes are duplicated24,25, the proportion of duplicated genes in A. avenae rises to 48%. We identified 30 collinear blocks containing 420 genes (1%) (Supplementary Figure 4, Supplementary Data 3), as well as 1216 (2.8%), 1146 (2.7%), and 17,942 (41.5%) genes involved in tandem, proximal, and dispersed duplication events, respectively. Based on gene ontology (GO) analysis, the top three functions of the duplicated genes are assigned to the GO terms hydrolase activity, protein binding, and nucleic acid binding.
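The category counts above can be checked against the overall 48% figure with a few lines of arithmetic. This sketch uses only the gene counts given in the text. The variable names are illustrative, and it assumes that the four categories do not overlap.

```python
# Gene counts per duplication category, as reported in the text.
TOTAL_GENES = 43_192
duplicated = {
    "collinear blocks": 420,
    "tandem": 1_216,
    "proximal": 1_146,
    "dispersed": 17_942,
}


def percent(count, total=TOTAL_GENES):
    """Share of all predicted protein-coding genes, in percent."""
    return 100.0 * count / total


for category, count in duplicated.items():
    print(f"{category:>16}: {count:>6} genes ({percent(count):.1f}%)")

total_duplicated = sum(duplicated.values())
print(f"{'all categories':>16}: {total_duplicated:>6} genes "
      f"({percent(total_duplicated):.1f}%)")
```

Under that assumption, the four categories sum to 20,724 genes, which rounds to the reported 48% of the 43,192 predicted genes.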

To examine global gene expression in response to extreme desiccation, we sequenced the transcriptome of A. avenae prepared under five conditions: 100% relative humidity (rh, fresh control), 97% rh (35% water content), 85% rh (8% water content), 40% rh (7% water content), and 0% rh (1% water content)19. The Illumina HiSeq 2000/2500 platforms generated 492 million paired-end reads, each with a coverage of at least 43×. Principal component analysis and multidimensional scaling analysis both confirmed the global pattern of differential gene expression (DGE) caused by desiccation (Supplementary Figures 5 and 6). Comparing the 97% rh condition with the fresh condition, we observed 9211 up-regulated genes and 5260 down-regulated genes (q < 0.05) (Figure 2a, Supplementary Figure 7). In addition, compared with 97% rh, A. avenae differentially expressed hundreds of genes at 85%, 40%, and 0% rh. These data indicate that the global response occurs in the early stage of gradual water loss in A. avenae, supporting the hypothesis that slow drying allows the activation of protective mechanisms that limit cell damage. Fifty-eight genes, such as those encoding small heat shock protein 12.6 and kinases, were differentially expressed under all conditions (Figure 2b). Although these genes differed in being up- or down-regulated at the preconditioning stage, they were all up-regulated at 85% and 40% rh and down-regulated at 0% rh.

a Heat map of global differential gene expression in A. avenae during water loss. The color represents the z-score of the gene expression value, which is calculated from the mean-centered FPKM value. b Expression of the 58 differentially expressed genes under the five conditions. FR: 100% rh, RA: 97% rh, RB: 85% rh, RC: 40% rh, RD: 0% rh.
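As an illustration of the z-score used for the heat map colors, the sketch below centers one gene's FPKM values on their mean across the five conditions. It assumes the usual definition, in which the centered values are also divided by the standard deviation. The caption mentions only mean centering, so the scaling step, the choice of population standard deviation, and the example values are assumptions.

```python
from statistics import mean, pstdev


def row_zscores(fpkm_values):
    """Z-scores of one gene's FPKM values across conditions.

    Each value is centered on the gene's mean and divided by the gene's
    population standard deviation (an assumed scaling step). A gene with
    constant expression gets all zeros.
    """
    mu = mean(fpkm_values)
    sigma = pstdev(fpkm_values)
    if sigma == 0:
        return [0.0 for _ in fpkm_values]
    return [(value - mu) / sigma for value in fpkm_values]


# Hypothetical FPKM values for one gene at FR, RA, RB, RC and RD.
example_gene = [4.0, 20.0, 28.0, 30.0, 8.0]
print([round(z, 2) for z in row_zscores(example_gene)])
```

Scaling per gene rather than per sample makes genes with very different absolute expression levels comparable on one color scale.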

Next, gene set analysis based on the KEGG database (adjusted p ≤ 0.05) (Table 1) identified 37 up-regulated pathways (34%) and 6 down-regulated pathways (6%). In addition, manual analysis based on Pathway Studio confirmed a similar regulation pattern (Figure 3a, Supplementary Figures 8-22, Supplementary Note). We observed that A. avenae enhanced purine and pyrimidine metabolism and the expression of genes in DNA repair and replication pathways, indicating that A. avenae withstands water loss by using chromosome protection mechanisms. The ubiquitin-mediated proteolysis pathway is significantly up-regulated to break down proteins, which may provide amino acids for the synthesis of protective molecules. In addition, the TCA cycle, oxidative phosphorylation, and electron transport chain pathways are up-regulated to produce more of the energy currency ATP, which may participate in the repair and recycling of macromolecules on entry into the ametabolic state. Most genes in the autophagy pathway are up-regulated, indicating that autophagy may be an effective way to remove damaged cells and aggregated proteins during anhydrobiosis. Consistently, up-regulated autophagy pathways have been observed in insects, yeast, and resurrection plants28,29,30. In contrast, the Ramazzottius varieornatus genome inhibits autophagy induction2. These data indicate that species have evolved unique protection mechanisms to resist extreme desiccation. Interestingly, we found that the Notch signaling and endocytosis pathways are up-regulated during desiccation. The Notch signaling pathway mediates a variety of cellular processes during development, including cell fate determination in Caenorhabditis elegans31,32. Activation of Notch signaling requires endocytosis-mediated vesicle trafficking to regulate downstream gene expression.
Although the specific functions of the Notch signaling and endocytosis pathways in the regulation of anhydrobiosis are still unknown, our data suggest that Notch signaling may determine cell specialization for adaptation to desiccation. Taken together, these findings indicate that A. avenae undergoes a dramatic programmed remodeling to enter the ametabolic state.

a Schematic diagram of pathways involved in the anhydrobiosis of A. avenae. U: number of up-regulated genes, D: number of down-regulated genes, T: total number of DGE genes. b Phylogenetic tree of group 3 LEA proteins in A. avenae, C. elegans, and A. thaliana.

To reveal the gradual transition of A. avenae into the anhydrobiotic state, we next examined the DGE patterns between successive relative humidity conditions. A total of 531 (85% rh vs 97% rh), 1261 (40% rh vs 85% rh), and 665 (0% rh vs 40% rh) genes were significantly up-regulated, whereas 359 (85% rh vs 97% rh), 1223 (40% rh vs 85% rh), and 976 (0% rh vs 40% rh) genes were significantly down-regulated. The Venn diagrams show that the up- or down-regulated genes rarely overlap among the comparisons between 97% and 0% rh (Supplementary Figures 23 and 24). In addition, GO analysis showed that the up-regulated genes were involved in hydrolase activity, transferase activity, and molecular binding activity, whereas the down-regulated genes played roles in nitrogen compound metabolism, catabolism, and localization (Supplementary Figures 25 and 26). Therefore, as water loss proceeded, most of the genes whose expression was up-regulated at 97% rh continued to be expressed at a similar level, whereas the genes that were further up-regulated in response to extreme desiccation were involved in macromolecule, phosphorus, lipid, and carbohydrate metabolism (Supplementary Figure 25). Consistent with the conclusions above, these data indicate that large-scale remodeling occurs in the early stage of anhydrobiosis (97% rh) in response to water loss and continues, with additional up-regulated metabolic genes, to adapt to extreme desiccation.

By manually annotating the top 30 significantly up-regulated proteins during desiccation, we found that 21 (70%) are novel putative proteins of unknown function. According to disordered protein analysis33, 10 of the 21 novel proteins are either completely disordered or contain only a few ordered amino acids. These intrinsically disordered proteins (IDPs) are expected to be able to form α-helical structures and may therefore have chaperone functions similar to those of LEA proteins and anhydrin during desiccation stress. We further identified 137 proteins as novel IDPs that may act as molecular shields (Supplementary Data 4). Compared with the other seven nematode species, 74 IDPs are species-specific to A. avenae (Supplementary Data 5).

Like IDPs, LEA proteins may form a gel phase to protect the organism from water loss18,20. Here, we identified 15 group 3 LEA proteins in the A. avenae genome (Figure 3b, Supplementary Note, Supplementary Figure 27). All of them were significantly up-regulated during desiccation, and two of them ranked among the top 10 significantly up-regulated proteins. The neighbor-joining phylogenetic tree shows that the 15 A. avenae LEAs have unique protein sequences and cluster closely together (Figure 3b). The inferred phylogenetic relationship of LEAs is closer between A. avenae and plants than between A. avenae and C. elegans (Figure 3b). These data indicate that dispersed duplications have occurred in the LEA family of A. avenae, promoting survival under complete desiccation. In addition, an ancient horizontal gene transfer of LEA genes may have occurred between A. avenae and plants.

The HSP70 family is another family that shows a surprising expansion in the A. avenae genome, compared with 6 copies in Drosophila, 22-32 in tapeworms, 18 in Arabidopsis, and two in humans34. Fifty HSP70 proteins were identified in A. avenae by sequence homology, 40 of which have expression values in our transcriptome. Six A. avenae HSP70 genes (12%) were duplicated by whole-genome duplication (WGD). A maximum-likelihood phylogenetic tree of HSP70s from A. avenae and other nematodes showed that most A. avenae HSP70 genes are species-specific (Supplementary Figure 28). During desiccation stress, 13 HSP70 genes were significantly up-regulated and 3 genes were significantly down-regulated (Supplementary Figure 28). These differentially expressed HSP70 genes are not clustered together in the phylogenetic tree, indicating the sequence diversity of the desiccation-responsive HSP70 proteins.

Genome analysis identified a remarkable number of 767 conventional protein kinases (ePKs) and 19 additional atypical protein kinases (aPKs) in A. avenae (Supplementary Table 10), representing the second largest metazoan kinome reported, after that of Haemonchus contortus. Kinome expansion occurred in the AGC group (cAMP-dependent, cGMP-dependent, and protein kinase C) and in two aPK groups: phosphatidylinositol 3-kinase-related kinases (PIKK) and right open reading frame (RIO) kinases (Supplementary Note). Transcriptome analysis showed that 230 kinase genes were up-regulated and 126 were down-regulated at 97% rh compared with 100% rh (Supplementary Data 7), which suggests that the phosphorylation network senses water loss and transmits signals to regulate functional gene expression. Overexpression of the plant homolog of the yeast AGC kinase Dbf2 has been reported to enhance the drought tolerance of yeast and transgenic plant cells. Therefore, the expansion of the A. avenae kinome may reflect evolutionary adaptation to repeated desiccation.

In this study, the genome and transcriptome of A. avenae provided molecular insights into the survival of metazoans under extreme desiccation. A recent whole-genome duplication may account for the unexpectedly large genome of A. avenae and may have favored its adaptation to repeated dehydration and rehydration cycles over millions of years. The expanded and diversified kinome is the second largest metazoan kinome and may play a role in precise signal transduction in response to extreme desiccation stress. We found that A. avenae reprograms its overall metabolism to enter the anhydrobiotic state. In addition, 74 species-specific IDPs were identified as potential molecular shields for resistance to extreme desiccation. Further functional characterization of these protective molecules will provide insights into desiccation survival.

Aphelenchus avenae was cultured on the plant pathogenic fungus Rhizoctonia solani, which was grown on a substrate of autoclaved wheat. The nematodes were washed with distilled water, collected on a filter, and then harvested by the 30% sucrose flotation method to remove contaminants. Aliquots of the nematodes were stored at -80 °C. To isolate genomic DNA, the nematodes were lysed in a lysis buffer containing 0.1 M Tris-Cl pH 8.5, 0.1 M NaCl, 50 mM ethylenediaminetetraacetic acid (EDTA) pH 8.0, and 1% sodium dodecyl sulfate (SDS), treated with proteinase K, and then extracted with phenol/chloroform. The 18S ribosomal RNA gene was amplified by polymerase chain reaction (PCR) and verified by Sanger sequencing on an Applied Biosystems (ABI) 3730 XL DNA analyzer. No contamination was detected in the 18S ribosomal RNA gene sequencing results.

The Roche 454 shotgun library and the 8-kb and 20-kb span paired-end libraries were prepared according to the Roche 454 manual and sequenced on the Roche GS FLX Titanium platform at the University of Hawaii. The Illumina 500-bp paired-end library was prepared according to the Illumina manual and sequenced on the Illumina HiSeq 2000 platform at the Yale Center for Genome Analysis (YCGA).

Illumina reads were scanned and trimmed with Trimmomatic v0.36 (ref. 37) to remove low-quality bases. Because the sequencing error rate of the Illumina platform is 0.5–2.5% (ref. 38), the Illumina reads were then corrected with Quake v0.2, which uses a maximum likelihood method and parallel k-mer counting39. Genome size was estimated by both flow cytometry and the k-mer frequency distribution method (Supplementary Note). The genome assembly reported here was generated with SOAPdenovo v2 and Newbler v2.8, and Illumina paired-end reads were then mapped to close gaps (Supplementary Note). Assembly accuracy was evaluated with the SOAPaligner v1 package. Assembly completeness was evaluated with the Core Eukaryotic Genes Mapping Approach (CEGMA v2.5) and BUSCO v5.0.0 (ref. 21) (Supplementary Note).
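The k-mer frequency approach can be illustrated with its most common form, in which genome size is approximated as the total number of k-mers divided by the depth of the main peak in the k-mer histogram. The sketch below assumes that simple estimator. The authors' actual procedure is described in the Supplementary Note, which is not reproduced here, and the function name, the error cut-off, and the toy histogram are hypothetical.

```python
def estimate_genome_size(kmer_histogram, min_depth=5):
    """Estimate genome size from a k-mer depth histogram.

    kmer_histogram maps a k-mer depth to the number of distinct k-mers
    observed at that depth. Depths below min_depth are treated as
    sequencing errors and ignored (an assumed cut-off).
    """
    usable = {d: n for d, n in kmer_histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above the minimum depth")
    # Depth of the main peak: the depth with the most distinct k-mers.
    peak_depth = max(usable, key=usable.get)
    # Total k-mer occurrences, excluding the low-depth error tail.
    total_kmers = sum(depth * count for depth, count in usable.items())
    return total_kmers / peak_depth


# Hypothetical histogram: an error spike at depth 1, a main peak at
# depth 20, and a small repeat peak at depth 40.
toy_histogram = {1: 1000, 19: 50, 20: 100, 21: 50, 40: 10}
print(estimate_genome_size(toy_histogram))
```

Real histograms are usually produced by a dedicated k-mer counter, and heterozygosity or repeats can shift or split the main peak, so this estimator is only a first approximation.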

A. avenae samples were prepared at 100, 97, 85, 40, and 0% relative humidity (rh), with three biological replicates per condition. All samples prepared for desiccation stress conditions were preconditioned at 97% rh for 72 hours. For each condition, more than 1 mg of mixed-stage nematodes was collected. Ten volumes of Trizol were added to each sample. The sample was vortexed briefly, frozen in a dry ice-ethanol bath, and stored at -80 °C until RNA isolation. For total RNA isolation, the nematode sample was thawed at 37 °C and then refrozen in a dry ice-ethanol bath, and this cycle was repeated 6 times. The sample was then placed on ice for 30 s and vortexed for 30 s, and this was repeated 6 times. After the sample had been kept at room temperature for 5 minutes, 2 ml of chloroform was added per ml. The sample was shaken for 15 seconds and then left to stand at room temperature for 2-3 minutes. The sample was centrifuged at 12,000 × g for 15 minutes at 4 °C. The aqueous phase was transferred to a new tube, 1 volume of isopropanol was added, and the mixture was incubated at room temperature for 10 minutes. The RNA was pelleted by centrifugation at 12,000 × g for 10 minutes. The RNA pellet was washed with 75% ethanol and centrifuged at 7,500 × g for 5 minutes. The ethanol was removed, and the RNA pellet was dissolved in water treated with diethyl pyrocarbonate (DEPC). RNA quality was evaluated with a NanoDrop and an Agilent 2100 Bioanalyzer with an RNA Pico chip.

Messenger RNA (mRNA) was isolated and cDNA libraries were prepared according to the Illumina mRNA sequencing manual. A 6-bp barcode was ligated to each sample, and the 15 samples were pooled for Illumina 75-bp paired-end sequencing on the HiSeq 2000 and 2500 platforms. A total of 492 million paired-end reads were generated. A detailed summary of the reads is given in Supplementary Table 5.

To identify de novo repeat families in the A. avenae genome, we used RepeatModeler v1.03 (ref. 40), which combines the de novo repeat finders RECON v1.05 (ref. 41) and RepeatScout v1.05 (ref. 42), the tandem repeat finder TRF v4.09 (ref. 43), and NCBI RMBLAST v2.2.27. TransposonPSI v20100822 was run via PSI-TBLASTN to search the assembled scaffolds and identify transposon families in the A. avenae genome. By combining the repeat library with the output of RepeatModeler and TransposonPSI, RepeatMasker v4.06 (ref. 40) identified 16.7% of the A. avenae genome as repeat and transposon regions. Details of the repeat families are listed in Supplementary Table 3.

Two data sets were prepared to train AUGUSTUS v2.5.5 (ref. 44) for gene model prediction in the A. avenae genome. One set was generated from transcripts verified by PASA v2.0.2, and the other set was generated from the eukaryotic orthologous groups (KOGs) identified in the A. avenae genome by CEGMA v2.5. The accuracy evaluation showed that AUGUSTUS trained on the KOG set produced better accuracy at the nucleotide, exon, and gene levels (Supplementary Table 6). Therefore, we used the parameters trained on the KOG set to predict gene models with AUGUSTUS. We further used the A. avenae KOG set to train SNAP v20131129 and GlimmerHMM v3.02 (ref. 45) for gene model prediction. GeneMark-ES v4.38 (ref. 46) was trained on the A. avenae genome sequence, because it uses a self-training algorithm that derives the species-specific parameters for gene prediction from the DNA sequence itself, without requiring additional species-specific transcript information. The four gene sets predicted by the ab initio programs were then combined with the high-quality transcripts verified by the PASA pipeline to generate the final gene set47. Operon analysis was performed with an in-house Perl script that counts genes located on the same strand with intergenic regions ranging from 25 to 1000 bases (Supplementary Note).
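The operon criterion described above (neighboring genes on the same strand separated by 25 to 1000 bases) can be sketched as follows. This is an illustration in Python, not the authors' in-house Perl script. The `Gene` record, the function name, and the toy coordinates are hypothetical, and the sketch assumes 1-based inclusive coordinates and that an operon contains at least two genes.

```python
from collections import namedtuple

# Hypothetical gene record with 1-based, inclusive coordinates.
Gene = namedtuple("Gene", "name scaffold strand start end")


def predict_operons(genes, min_gap=25, max_gap=1000):
    """Group neighboring same-strand genes into candidate operons.

    Consecutive genes on one scaffold are chained together when they lie
    on the same strand and the intergenic distance between them is
    between min_gap and max_gap bases. Chains of two or more genes are
    returned as lists of gene names.
    """
    operons = []
    current = []
    for gene in sorted(genes, key=lambda g: (g.scaffold, g.start)):
        if current:
            prev = current[-1]
            gap = gene.start - prev.end - 1
            linked = (
                gene.scaffold == prev.scaffold
                and gene.strand == prev.strand
                and min_gap <= gap <= max_gap
            )
            if not linked:
                if len(current) >= 2:
                    operons.append([g.name for g in current])
                current = []
        current.append(gene)
    if len(current) >= 2:
        operons.append([g.name for g in current])
    return operons


toy_genes = [
    Gene("g1", "scaffold1", "+", 100, 500),
    Gene("g2", "scaffold1", "+", 600, 900),     # 99 bases after g1
    Gene("g3", "scaffold1", "-", 1000, 1400),   # strand switch
    Gene("g4", "scaffold1", "-", 1500, 1800),   # 99 bases after g3
    Gene("g5", "scaffold1", "-", 5000, 5500),   # gap too large
    Gene("g6", "scaffold2", "+", 10, 50),       # different scaffold
]
print(predict_operons(toy_genes))
```

In practice the gene records would come from the annotation file (for example GFF), and the fraction of genes falling in predicted operons could then be compared with the 23.3% reported in the Results.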

To identify clusters of putative orthologous groups (COGs), we ran the standalone OrthoMCL v2.0.9 software48, which uses the Markov clustering algorithm to group proteins based on sequence similarity. OrthoMCL first removes low-quality sequences. All 43,192 A. avenae proteins passed this filter and were retained for downstream steps. The proteome set was then compared by all-versus-all BLASTP, using NCBI BLASTP with an e-value cut-off of 1e-5. The all-versus-all BLASTP output was loaded into a MySQL database. Putative orthologs were identified as the best-scoring matches between species, whereas paralogs were identified as within-species matches with better similarity scores.
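The between-species step can be illustrated with reciprocal best hits under the same e-value cut-off. This is a deliberate simplification and not OrthoMCL itself, which also normalizes scores, adds within-species paralogs, and applies Markov clustering. The function names and the toy hit tuples (query, subject, e-value, bit score) are hypothetical.

```python
def best_hits(hits, max_evalue=1e-5):
    """Return {query: best subject} by bit score, after e-value filtering.

    hits is an iterable of (query, subject, evalue, bitscore) tuples,
    for example parsed from BLASTP tabular output.
    """
    best = {}
    for query, subject, evalue, bitscore in hits:
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a, max_evalue=1e-5):
    """Pairs (a, b) where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b, max_evalue)
    backward = best_hits(b_vs_a, max_evalue)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)


# Hypothetical hits between species A and species B.
a_vs_b = [
    ("A1", "B1", 1e-50, 200.0),
    ("A1", "B2", 1e-20, 90.0),
    ("A2", "B2", 1e-3, 30.0),    # fails the 1e-5 cut-off
    ("A3", "B3", 1e-30, 120.0),
]
b_vs_a = [
    ("B1", "A1", 1e-48, 195.0),
    ("B3", "A4", 1e-40, 150.0),  # B3 prefers A4, so A3-B3 is not reciprocal
    ("B3", "A3", 1e-30, 120.0),
]
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```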

Protein sequences of C. elegans, B. malayi, M. hapla, A. suum, B. xylophilus, H. contortus, and T. spiralis were downloaded from WormBase (https://wormbase.org/), release WS233.

To construct a phylogenetic tree of the eight nematode species, 168 single-copy ortholog clusters from the eight nematode genomes were aligned with MAFFT v7.046b (ref. 49) and then concatenated for tree estimation with the RAxML v8.1.3 package. T. spiralis was set as the outgroup. The gamma model of rate heterogeneity and the DAYHOFF amino acid substitution matrix were applied. Then 1000 rapid bootstrap inferences were performed, followed by a thorough maximum likelihood analysis. Gene gain/loss was analyzed with the Dollo and polymorphism parsimony program (Dollop) of the PHYLIP package v3.695 (ref. 51). The detailed methods for the phylogenetic analysis of the LEA and HSP70 proteins are given in the Supplementary Note.

To identify paralogs in the A. avenae genome, we performed an all-versus-all BLASTP analysis of the 43,192 gene models. To further examine pairwise collinear relationships and other types of gene duplication, we ran the MCScanX program52 to scan the NCBI BLASTP output and cluster collinear regions. The collinear relationships among the 23 largest scaffolds were visualized with Circos v0.64 (ref. 53). The list of collinear genes is given in Supplementary Data 3. Collinear relationships between species were analyzed with SyMAP v4.2 (refs. 54,55) and then displayed with Circos v0.64.

RNA-seq reads were mapped to the assembled scaffolds, and exons and possible splice junctions were identified by TopHat2 (ref. 56) with Bowtie2 v2.1.0 (ref. 57). TopHat2 collects the initial mappings and potential exon information to create a database of possible splice junctions. All input reads are then divided into smaller segments and mapped independently again to avoid missing exons shorter than 100 bp. In the last step, TopHat2 brings the segment alignments together and generates end-to-end read alignments. The TopHat2 output was then assembled into transcripts with Cufflinks2 (refs. 58,59). The assembled transcripts were re-aligned to the draft genome assembly with the Genomic Mapping and Alignment Program (GMAP) and verified with the Program to Assemble Spliced Alignments (PASA). Cuffdiff was used for comparative transcriptome analysis. Further statistical analysis was performed with the R package CummeRbund v2.33.0 (ref. 60). The heat maps of DGE were generated with Cluster 3.0 and visualized with Java TreeView v1.1.6. We further identified Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and Reactome pathways by running a locally installed KOBAS 2.0 (ref. 61). For gene set analysis (GSA), we used the R package piano v2.6.0 (ref. 62).
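The FPKM values reported by this pipeline follow the textbook definition: fragments mapped to a transcript, normalized by transcript length in kilobases and by the total number of mapped fragments in millions. The sketch below shows only that arithmetic. Cufflinks itself estimates abundances with a statistical model and effective transcript lengths, so this is not a re-implementation, and the function name and example numbers are hypothetical.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions)


# Hypothetical example: 500 fragments on a 2-kb transcript in a library
# of 25 million mapped fragments.
print(fpkm(500, 2_000, 25_000_000))
```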

Cufflinks v2, CummeRbund v2.33.0, and piano v2.6.0 were used for statistical analysis. A p-value of <0.05 was considered significant. For multiple hypothesis testing, an adjusted p-value of <0.05 was considered significant. Three biological replicates were performed for each condition.
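The text does not name the adjustment method behind the adjusted p-values. A widely used choice for differential expression is the Benjamini-Hochberg false discovery rate procedure, sketched below under that assumption. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, in the input order.

    Each p-value is multiplied by m / rank (m tests, rank from smallest
    to largest), and a running minimum taken from the largest rank
    downwards keeps the adjusted values monotonic.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
significant = [q < 0.05 for q in adjusted]
print(adjusted, significant)
```

In this toy example three of the raw p-values are below 0.05, but only the first remains significant after adjustment, which is why the adjusted threshold matters when thousands of genes are tested at once.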

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

The supplementary data needed to generate the figures and interpret the results of this study are publicly available in Figshare at https://doi.org/10.6084/m9.figshare.16640293 (ref. 63). The sequencing data are deposited in the Sequence Read Archive database (PRJNA236621, PRJNA236622). The genome assembly with gene annotation is deposited under PRJNA236621. All data can be found in the manuscript or the supplementary materials.

A correction to this article has been published: https://doi.org/10.1038/s42003-021-02824-5

Browne, J. A. et al. Dehydration-specific induction of hydrophilic protein genes in the anhydrobiotic nematode Aphelenchus avenae. Eukaryot. Cell 3, 966–975 (2004).

Flot, J.-F. et al. Genomic evidence for ameiotic evolution in the bdelloid rotifer Adineta vaga. Nature 500, 453–457 (2013).

CAS PubMed PubMed Central Google Scholar 

Hashimoto, T. et al. Extremotolerant tardigrade genome and improved radiotolerance of human cultured cells by tardigrade-unique protein. Nat. Commun. 7, 12808 (2016).

Koutsovoulos, G. et al. No evidence for extensive horizontal gene transfer in the genome of the tardigrade Hypsibius dujardini. Proc. Natl Acad. Sci. USA 113, 5053–5058 (2016).

Gladyshev, E. A., Meselson, M. & Arkhipova, I. R. Massive horizontal gene transfer in bdelloid rotifers. Science 320, 1210–1213 (2008).

Yoshida, Y. et al. Comparative genomics of the tardigrades Hypsibius dujardini and Ramazzottius varieornatus. PLoS Biol. 15, e2002266 (2017).

Boothby, T. C. et al. Tardigrades use intrinsically disordered proteins to survive desiccation. Mol. Cell 65, 975–984 (2017).

Mazin, P. V. et al. Cooption of heat shock regulatory system for anhydrobiosis in the sleeping chironomid Polypedilum vanderplanki. Proc. Natl Acad. Sci. USA 115, E2477–E2486 (2018).

Belott, C., Janis, B. & Menze, M. A. Liquid-liquid phase separation promotes animal desiccation tolerance. Proc. Natl Acad. Sci. USA 117, 27676–27684 (2020).

Ryabova, A. et al. Combined metabolome and transcriptome analysis reveals key components of complete desiccation tolerance in an anhydrobiotic insect. Proc. Natl Acad. Sci. USA 117, 19209–19220 (2020).

Tyson, T. et al. A molecular analysis of desiccation tolerance mechanisms in the anhydrobiotic nematode Panagrolaimus superbus using expressed sequence tags. BMC Res. Notes 5, 68 (2012).

Adhikari, B. N., Wall, D. H. & Adams, B. J. Desiccation survival in an Antarctic nematode: molecular analysis using expressed sequence tags. BMC Genomics 10, 69 (2009).

Adhikari, B. N., Wall, D. H. & Adams, B. J. Effect of slow desiccation and freezing on gene transcription and stress survival of an Antarctic nematode. J. Exp. Biol. 213, 1803–1812 (2010).

Wharton, D. A., Goodall, G. & Marshall, C. J. Freezing rate affects the survival of a short-term freezing stress in Panagrolaimus davidi, an Antarctic nematode that survives intracellular freezing. CryoLetters 23, 5–10 (2002).

Wharton, D. A., Downes, M. F., Goodall, G. & Marshall, C. J. Freezing and cryoprotective dehydration in an Antarctic nematode (Panagrolaimus davidi) visualised using a freeze substitution technique. Cryobiology 50, 21–28 (2005).

Wharton, D. A. & Raymond, M. R. Cold tolerance of the Antarctic nematodes Plectus murrayi and Scottnema lindsayae. J. Comp. Physiol. B 185, 281–289 (2015).

Xue, X., Suvorov, A., Fujimoto, S., Dilman, A. R. & Adams, B. J. Genome analysis of Plectus murrayi, a nematode from continental Antarctica. G3 11, jkaa045 (2021).

Browne, J., Tunnacliffe, A. & Burnell, A. Anhydrobiosis: plant desiccation gene found in a nematode. Nature 416, 38 (2002).

Higa, L. M. & Womersley, C. Z. New insights into the anhydrobiotic phenomenon: the effects of trehalose content and differential rates of evaporative water loss on the survival of Aphelenchus avenae. J. Exp. Zool. 267, 120–129 (1993).

Reardon, W. et al. Expression profiling and cross-species RNA interference (RNAi) of desiccation-induced transcripts in the anhydrobiotic nematode Aphelenchus avenae. BMC Mol. Biol. 11, 6 (2010).

Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).

Allen, M. A., Hillier, L., Waterston, R. H. & Blumenthal, T. A global analysis of Caenorhabditis elegans trans-splicing. Genome Res. 21, 255–264 (2011).

Mitreva, M. et al. The draft genome of the parasitic nematode Trichinella spiralis. Nat. Genet. 43, 228–235 (2011).

Gu, Z., Cavalcanti, A., Chen, F.-C., Bouman, P. & Li, W.-H. Extent of gene duplication in the genomes of Drosophila, nematode, and yeast. Mol. Biol. Evol. 19, 256–262 (2002).

Cavalcanti, A. R. O., Ferreira, R., Gu, Z. & Li, W.-H. Patterns of gene duplication in Saccharomyces cerevisiae and Caenorhabditis elegans. J. Mol. Evol. 56, 28–37 (2003).

Wharton, D. A. in The Biology of Nematodes (ed. Lee, D. L.) 389–411 (Taylor & Francis, 2002).

Kanehisa, M., Sato, Y., Kawashima, M., Furumichi, M. & Tanabe, M. KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res. 44, D457–D462 (2016).

Teets, N. M. et al. Gene expression changes governing extreme dehydration tolerance in an Antarctic insect. Proc. Natl Acad. Sci. USA 109, 20744–20749 (2012).

Ratnakumar, S. et al. Phenomic and transcriptomic analyses reveal that autophagy plays a major role in desiccation tolerance in Saccharomyces cerevisiae. Mol. Biosyst. 7, 139–149 (2011).

Liu, J., Moyankova, D., Djilianov, D. & Deng, X. Common and specific mechanisms of desiccation tolerance in two Gesneriaceae resurrection plants. Multiomics evidences. Front. Plant Sci. 10, 1067 (2019).

Greenwald, I. LIN-12/Notch signaling: lessons from worms and flies. Genes Dev. 12, 1751–1762 (1998).

Greenwald, I. LIN-12/Notch signaling in Caenorhabditis elegans. WormBook https://doi.org/10.1895/Wormbook (2005).

Dosztányi, Z., Csizmok, V., Tompa, P. & Simon, I. IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content. Bioinformatics 21, 3433–3434 (2005).

Lin, B. L. et al. Genomic analysis of the Hsp70 superfamily in Arabidopsis thaliana. Cell Stress Chaperones 6, 201–208 (2001).

Schwarz, E. M. et al. The genome and developmental transcriptome of the strongylid nematode Haemonchus contortus. Genome Biol. 14, R89 (2013).

Lee, J. H., Van Montagu, M. & Verbruggen, N. A highly conserved kinase is an essential component for stress tolerance in yeast and plant cells. Proc. Natl Acad. Sci. USA 96, 5873–5877 (1999).

Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).

Kelley, D. R., Schatz, M. C. & Salzberg, S. L. Quake: quality-aware detection and correction of sequencing errors. Genome Biol. 11, R116 (2010).

Marçais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27, 764–770 (2011).

Smit, AFA & Hubley, R. RepeatModeler Open-1.0. 2008-2015 http://www.repeatmasker.org (2008-2015).

Bao, Z. & Eddy, S. R. Automated de novo identification of repeat sequence families in sequenced genomes. Genome Res. 12, 1269–1276 (2002).

Price, A. L., Jones, N. C. & Pevzner, P. A. De novo identification of repeat families in large genomes. Bioinformatics 21, i351–i358 (2005).

Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580 (1999).

Stanke, M., Schöffmann, O., Morgenstern, B. & Waack, S. Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sources. BMC Bioinformatics 7, 62 (2006).

Majoros, W. H., Pertea, M. & Salzberg, S. L. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics 20, 2878–2879 (2004).

Lomsadze, A. et al. Gene identification in novel eukaryotic genomes by self-training algorithm. Nucleic Acids Res. 33, 6494–6506 (2005).

Haas, B. J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol. 9, R7 (2008).

Li, L., Stoeckert, C. J. & Roos, D. S. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 13, 2178 (2003).

Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).

Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014).

Felsenstein, J. PHYLIP - Phylogeny Inference Package (version 3.2). Cladistics 5, 164–166 (1989).

Wang, Y. et al. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 40, e49 (2012).

Krzywinski, M. et al. Circos: an information aesthetic for comparative genomics. Genome Res. 19, 1639–1645 (2009).

Soderlund, C., Nelson, W., Shoemaker, A. & Paterson, A. SyMAP: a system for discovering and viewing syntenic regions of FPC maps. Genome Res. 16, 1159–1168 (2006).

Soderlund, C., Bomhoff, M. & Nelson, W. SyMAP v3.4: a turnkey synteny system with application to plant genomes. Nucleic Acids Res. 39, e68 (2011).

Kim, D. et al. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14, R36 (2013).

Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).

Trapnell, C. et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28, 511–515 (2010).

Trapnell, C. et al. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat. Biotechnol. 31, 46–53 (2013).

Goff, L., Trapnell, C. & Kelley, D. cummeRbund: analysis, exploration, manipulation, and visualization of Cufflinks high-throughput sequencing data. R package version 2.22.0 https://bioconductor.org/packages/cummeRbund/ (2013).

Xie, C. et al. KOBAS 2.0: a web server for annotation and identification of enriched pathways and diseases. Nucleic Acids Res. 39, W316–W322 (2011).

Väremo, L., Nielsen, J. & Nookaew, I. Enriching the gene set analysis of genome-wide data by incorporating directionality of gene expression and combining statistical hypotheses and methods. Nucleic Acids Res. 41, 4378–4391 (2013).

Wan, X. Aphelenchus avenae genome source data. Figshare dataset https://doi.org/10.6084/m9.figshare.16640293.v1 (2021).

This work was supported by Advanced Studies in Genomics, Proteomics and Bioinformatics at the University of Hawaii (M.A.). The USDA is an equal opportunity employer. Trade names or commercial products are mentioned in this publication solely to provide specific information and do not imply recommendation or endorsement by the USDA.

Advanced Studies in Genomics, Proteomics and Bioinformatics, University of Hawaii, Honolulu, Hawaii, USA

Xuehua Wan, Jennifer A. Saito, Shaobin Hou & Maqsudul Alam

TEDA Institute of Biological Sciences and Biotechnology, Nankai University, Tianjin, China

Tropical Crops and Commodity Conservation Research Group, ARS Pacific Basin Agricultural Research Center, United States Department of Agriculture, Hilo, Hawaii, U.S.

Elsevier Life Sciences Solutions, Rockville, Maryland, USA

College of Life Sciences, University of Hawaii, Honolulu, Hawaii, USA

Lynne M. Higa & Christopher Z. Womersley

X.W., J.A.S., and M.A. designed the study and wrote the manuscript. X.W., S.H., L.M.H., and C.Z.W. performed the experiments, and X.W., J.A.S., S.M.G., and A.Y. analyzed the data. M.A. acquired the funding.

Correspondence to Xuehua Wan or Christopher Z. Womersley.

The authors declare no competing interests.

Peer review information Communications Biology thanks Philipp Schiffer and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Caitlin Karniski. Peer reviewer reports are available.

Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third-party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Wan, X., Saito, J. A., Hou, S. et al. The Aphelenchus avenae genome highlights the evolutionary adaptation to desiccation. Commun. Biol. 4, 1232 (2021). https://doi.org/10.1038/s42003-021-02778-8

DOI: https://doi.org/10.1038/s42003-021-02778-8

Commun Biol ISSN 2399-3642 (online)